Language design for distributed stream processing

Author

  • Ryan Newton
Abstract

Applications that combine live data streams with embedded, parallel, and distributed processing are becoming increasingly common. WaveScript is a domain-specific language that brings high-level, type-safe, garbage-collected programming to these domains. This is made possible by three primary implementation techniques, each of which leverages characteristics of the streaming domain. First, WaveScript employs an evaluation strategy that combines interpretation and reification to partially evaluate programs into stream dataflow graphs. Second, we use profile-driven compilation to enable many optimizations that are normally available only in the synchronous (rather than asynchronous) dataflow domain. Finally, an empirical, profile-driven approach also allows us to compute practical partitions of dataflow graphs, spreading them across embedded nodes and more powerful servers. We have used our language to build and deploy applications, including a sensor network for the acoustic localization of wild animals such as the yellow-bellied marmot. We evaluate WaveScript’s performance on this application, showing that it performs well on both embedded and desktop-class machines. Our language allowed us to implement the application rapidly, while outperforming a previous C implementation by over 35% and using fewer than half the lines of code. We evaluate the contribution of our optimizations to this success. We also evaluate WaveScript’s ability to extract parallelism from this and other applications.

Thesis Supervisor: Samuel Madden
Title: Associate Professor

Thesis Supervisor: Arvind
Title: Johnson Professor

Similar resources

Change-Resilient Design and Dataflow Optimization for Distributed XML Stream Processors

We propose a new stream-processing framework based on a virtual assembly line (val) model. We instantiate the val framework obtaining ∆-XML, an approach for designing and optimizing distributed XML processing pipelines. val/∆-XML greatly simplifies the design of change-resilient dataflow pipelines: XML processors (called actors) can be inserted, deleted, and their “scope of work” (the parts of ...

Stream Processing on the Grid: an Array Stream Transforming Language

Specific requirements of stream processing on the Grid are discussed. We argue that when the stream processing paradigm is used for cluster computing, the processing components can be coded in the form of data-parallel recurrence relations with stream synchronization and filtering at the interfaces. We propose a programming language ASTL in which such components can be written and describe some...

Distributed S-Net

S-NET is a declarative coordination language and component technology primarily aimed at modern multicore/many-core chip architectures. It builds on the concept of stream processing to structure dynamically evolving networks of communicating asynchronous components, which themselves are implemented using a conventional language suitable for the application domain. We sketch out the design and i...

A Comprehension-Based Database Language and Its Distributed Execution

This paper describes a way to noticeably reduce the description cost of database operations executed in distributed computing environments by design of a novel declarative language to describe database operations, development of program transformation techniques to improve efficiency at execution time, and clarification of prerequisites to execute the programs in distributed computing environments...

Verteilung globaler Anfragen auf heterogene Stromverarbeitungssysteme

Deployment of Global Queries in Distributed and Heterogeneous Stream-Processing Systems. Distributed in-network stream processing is more efficient than sending all data to a central processing unit. In the past few years Stream-Processing Systems (SPSs) have established themselves as an interesting alternative to database systems for continuous query processing. There are many scenarios having w...

Research Statement Robert Soulé

With my research, I want to simplify the development of complex systems through the use of programming language technologies. Most of my work has focused on distributed stream processing [2, 3, 6, 7, 8], but I have also explored distributed storage systems [1], and security in peer-to-peer content distribution networks [4]. My methodology is to start by developing a formal model, and then to us...

Publication date: 2009